SemanticScuttle - klotz.me » Tags: llm+machine learning+fine tuning



The article introduces test-time scaling, a new approach to language modeling that improves performance by spending additional compute at test time. The authors combine a curated dataset with a technique called budget forcing, which controls how much compute the model uses and lets it double-check its answers and improve its reasoning. Applied to the Qwen2.5-32B-Instruct language model, the approach yields significant improvements on competition math questions.

2025-02-14 Tags: arxiv, test-time scaling, budget forcing, llm, qwen2.5-32b-instruct, sft, fine tuning, reinforcement learning, machine learning, deepseek-r1 by klotz
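
To make the budget forcing idea in the summary above concrete, here is a minimal sketch in Python. It is not the authors' code. It assumes that reasoning ends with a delimiter token, that an early stop is overridden by appending a nudge word such as "Wait", and that the delimiter is forced once the budget is spent. The names `generate_with_budget`, `next_token`, `end_think`, and `nudge` are hypothetical, and `next_token` stands in for a real model's decoding step.

```python
def generate_with_budget(next_token, prompt, min_tokens, max_tokens,
                         end_think="</think>", nudge="Wait"):
    """Toy budget forcing loop (illustrative assumption, not the paper's code).

    next_token(prompt, tokens) is a hypothetical callable that returns the
    next token as a string, given the prompt and the tokens produced so far.
    """
    tokens = []
    while len(tokens) < max_tokens:
        tok = next_token(prompt, tokens)
        if tok == end_think:
            if len(tokens) >= min_tokens:
                break  # the model stopped on its own, within budget
            # Stopped too early: suppress the stop and nudge it to keep going.
            tokens.append(nudge)
            continue
        tokens.append(tok)
    # Close the reasoning, whether the stop was natural or forced by max_tokens.
    tokens.append(end_think)
    return tokens
```

Raising `min_tokens` makes the model spend more test-time compute (and gives it a chance to re-check an answer), while lowering `max_tokens` caps it.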

mistral-finetune - GitHub

A lightweight codebase that enables memory-efficient and performant fine-tuning of Mistral's models. It is based on LoRA, a training paradigm in which most weights are frozen and only 1-2% additional weights, in the form of low-rank matrix perturbations, are trained.

2024-06-06 Tags: github, mistral, lora, python, machine learning, fine tuning, llm by klotz
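
The low-rank perturbation described above can be sketched in a few lines of plain Python. This is an illustration of the general LoRA idea, not code from mistral-finetune. The function names, the `scale` parameter, and the example sizes (a 4096 x 4096 layer with rank 32) are assumptions chosen for the example.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def lora_forward(x, w_frozen, lora_a, lora_b, scale=1.0):
    """Compute (W + scale * B @ A) @ x without forming the full update.

    w_frozen: d_out x d_in, kept fixed during training.
    lora_a:   rank x d_in, trainable.
    lora_b:   d_out x rank, trainable (often initialized to zeros).
    """
    base = matvec(w_frozen, x)                   # frozen path
    update = matvec(lora_b, matvec(lora_a, x))   # low-rank path: B @ (A @ x)
    return [b + scale * u for b, u in zip(base, update)]


def trainable_fraction(d_out, d_in, rank):
    """Added LoRA weights relative to the frozen weights of one layer."""
    return rank * (d_in + d_out) / (d_out * d_in)


# Assumed sizes: a 4096 x 4096 layer with rank 32 adds about 1.6% new weights.
print(trainable_fraction(4096, 4096, 32))  # 0.015625
```

With `lora_b` set to zeros, `lora_forward` returns exactly the frozen layer's output, so training starts from the unmodified model.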
